This document will guide you through the steps to obtain an ecological niche model using MaxEnt with WorldClim climatic variables and GBIF records. The code is based on several sources including: https://github.com/danlwarren/ENMTools http://spatialecology.weebly.com/r-code--data/category/sdm-maxent https://cmerow.github.io/RDataScience/3_6_Teaching_Ecoinformatics.html
To begin with you need to install and load some packages. The lines to install and load packages have been annotated in case you already have them, if that is not the case you can delete the ‘#’ symbol at the beginning of the relevant line. If you are using MacOS and get errors when loading the library rJava go to the terminal and type ‘sudo R CMD javareconf’ and restart R or RStudio.
#install.packages(raster)
#library(raster)
#install.packages('rgdal')
#library(rgdal)
#install.packages('maps')
#library(maps)
#install.packages('mapdata')
#library(mapdata)
#install.packages('dismo')
#library(dismo)
#install.packages('rJava')
#library(rJava)
#install.packages('maptools')
#library(maptools)
#install.packages('jsonlite')
#library(jsonlite)
Now download the WorldClim Bioclimatic variables (check for more info https://www.worldclim.org/bioclim), in this case with a resolution of 2.5 minutes (c.4.5 km at the equator).
currentEnv=getData("worldclim", var="bio", res=2.5)
We are going to predict using an scenario (one of the many available; HadGEM2-ES,rcp85) of the Bioclimatic variables for 2070. Download the future climate layers with the same resolution and match the names of current and future climatic variables.
futureEnv=getData('CMIP5', var='bio', res=2.5, rcp=85, model='HE', year=70)
names(futureEnv)=names(currentEnv)
Also to avoid having to much data we are going to focus on just a few Bioclimatic variables. You can check what they actually are here https://www.worldclim.org/bioclim and use the ones you prefer. Notice that the listed variables in the code are the ones we are not using.
currentEnv=dropLayer(currentEnv, c("bio2", "bio3", "bio4", "bio10", "bio11", "bio13", "bio14", "bio15"))
futureEnv=dropLayer(futureEnv, c("bio2", "bio3", "bio4", "bio10", "bio11", "bio13", "bio14", "bio15"))
Now get the records from GBIF for the Pawpaw tree (Asimina triloba, Annonaceae) and store them in an object and get rid of records without geographic coordinates and duplicates. This tree species grows on natural areas on campus, but unless you are really good with bark characters you’ll probably have to wait a few months to spot it.
atraw=gbif('asimina','triloba')
## 6115 records found
## 0-300-600-900-1200-1500-1800-2100-2400-2700-3000-3300-3600-3900-4200-4500-4800-5100-5400-5700-6000-6115 records downloaded
at=subset(atraw, !is.na(lon) & !is.na(lat))
atdups=duplicated(at[, c("lon", "lat")])
at <-at[!atdups, ]
Plot your occurences for diagnostic purposes. Setting the extent of the map to 10 degrees beyond your maximum and minimum values of latitude and longitude.
data(wrld_simpl)
lonb<-round(c(min(at$lon)-10,max(at$lon)+10))
latb<-round(c(min(at$lat)-10,max(at$lat)+10))
plot(wrld_simpl, xlim=lonb, ylim=latb, axes=TRUE, col="light yellow")
points(at$lon, at$lat, col="orange", pch=20, cex=0.75)
Eliminate occurences outside the native range, Pawpaw trees are native to the east of the USA, so anything with longitude between 65ª and 105ª W and latitude above 25ªN is plausible in this case (GBIF has occurences of cultivated plants in other continents, poorly georeferenced occurences in the middle of the ocean).
at <- at[at$lon < -65 & at$lon > -105 & at$lat > 25 & at$basisOfRecord == "PRESERVED_SPECIMEN", ]
lonb<-round(c(min(at$lon)-10,max(at$lon)+10))
latb<-round(c(min(at$lat)-10,max(at$lat)+10))
Plot again to check the filtered occurences.
map('worldHires', xlim=lonb, ylim=latb, fill=TRUE, col='grey',border='black',lwd='0.2')
points(at$lon, at$lat, col="orange", pch=20, cex=0.75)
Now that you have the presence data you can see some of the climatic variables that you are going to include in the model. Like BIO1 that is Annual Mean Temperature in ªC for current climate.
colfunc<-colorRampPalette(c("royalblue","springgreen","yellow","red"))
plot(currentEnv[["bio1"]]/10,xlim=lonb, ylim=latb, main="Annual Mean Temperature in ªC", col=colfunc(50))
map('worldHires',xlim=lonb,ylim=latb, fill=FALSE, add=TRUE,col='grey')
points(at$lon, at$lat, pch="+", cex=0.4)
Or the same variable in the scenario of climate change for 2070.
plot(futureEnv[["bio1"]]/10,main="Future Annual Mean temperature in ºC", col=colfunc(50),xlim=lonb,ylim=latb)
map('worldHires',xlim=lonb, ylim=latb, fill=FALSE, add=TRUE,col='grey')
points(at$lon, at$lat, pch="+", cex=0.4)
Crop the environmental layers of current and future climate to the area where Pawpaws are native.
model.extent<-extent(c(lonb,latb))
modelEnv=crop(currentEnv,model.extent)
modelFutureEnv=crop(futureEnv, model.extent)
Now randomly split your dataset in some occurrences for training the model (the ones you use to create the ENM) and some for testing how well the model performs (some occurences that you didn’t include in the algorithm, but where a good model should be predicting suitable climate). In this case we are keeping 20% for testing because we have many occurrences.
atocc=cbind.data.frame(at$lon,at$lat)
fold <- kfold(atocc, k=5)
attest <- atocc[fold == 1, ]
attrain <- atocc[fold != 1, ]
Now run Maxent with the training occurences and the current environmental variables. You will get an error if you don’t have maxent installed, download it from the website specified in the error and save it in the directory that is also specified in the error.
at.me <- maxent(modelEnv, attrain)
Check the variables that helped the algorithm the most.
plot(at.me)
And the response curves of each variable for you dataset.
response(at.me)
Now that you created a model of the climatic niche of the Pawpaw trees you can predict if each cell in the whole study area has suitable climate for Pawpaws according to your model with current climate…
at.pred <- predict(at.me, modelEnv)
and see it on a map
plot(at.pred, main="Predicted Suitability")
map('worldHires', fill=FALSE, add=TRUE,col='grey',xlim=lonb,ylim=latb)
points(at$lon, at$lat, pch="+", cex=0.4)
Furthermore you can project the model to the layers with the future climate scenario…
at.2070 = predict(at.me, modelFutureEnv)
and plot it too.
plot(at.2070, main="Predicted Future Suitability")
map('worldHires', fill=FALSE, add=TRUE,col='grey')
points(at$lon, at$lat, pch="+", cex=0.4)
A good way to compare is to visualize the difference for the values on each cell for current vs future climate.
at.change=at.2070-at.pred
plot(at.change, main="Predicted Change in Suitability")
map('worldHires', fill=FALSE, add=TRUE,col='grey')
points(at$lon, at$lat, pch="+", cex=0.4)
Or building an histogram of the values for the difference of suitability on each cell that we used as occurrence data.
atChangePoints = extract(at.change, atocc)
hist(atChangePoints, main="", xlab="Change in habitat suitability")
abline(v=0, col="red")
We did not check how well our model performs with the test dataset. Evaluate your model with your test occurences and 1000 background pseudoabscences with the receiver operating characteristic curve (ROC curve) and it’s area under the curve (AUC).
bg <- randomPoints(modelEnv, 1000) #background "pseudoabsences"
e1 <- evaluate(at.me, p=attest, a=bg, x=modelEnv)
plot(e1, 'ROC')
Checking the variables for correlation is important. Download the following packages to obtain the package ENMTools.
#install.packages("devtools")
#library(devtools)
#install_github("danlwarren/ENMTools")
#library(ENMTools)
Now estimate the correlations coefficients among the variables you used for the model
raster.cor.matrix(modelEnv)
| bio1 | bio5 | bio6 | bio7 | bio8 | bio9 | bio12 | bio16 | bio17 | bio18 | bio19 | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bio1 | 1.0000000 | 0.9157535 | 0.9793219 | -0.8424893 | 0.6410225 | 0.8801279 | 0.1896500 | 0.2852680 | 0.0623840 | 0.1344129 | 0.1491180 |
| bio5 | 0.9157535 | 1.0000000 | 0.8432893 | -0.5760960 | 0.6505395 | 0.7540111 | -0.0297564 | 0.0345443 | -0.1010467 | -0.0937405 | -0.0246704 |
| bio6 | 0.9793219 | 0.8432893 | 1.0000000 | -0.9251256 | 0.5491258 | 0.9081366 | 0.2302545 | 0.3149690 | 0.1136095 | 0.1521556 | 0.2026153 |
| bio7 | -0.8424893 | -0.5760960 | -0.9251256 | 1.0000000 | -0.3755834 | -0.8484835 | -0.3711966 | -0.4546106 | -0.2441594 | -0.2976201 | -0.3255694 |
| bio8 | 0.6410225 | 0.6505395 | 0.5491258 | -0.3755834 | 1.0000000 | 0.3537701 | -0.0780296 | 0.1265574 | -0.2611904 | 0.1482476 | -0.2644505 |
| bio9 | 0.8801279 | 0.7540111 | 0.9081366 | -0.8484835 | 0.3537701 | 1.0000000 | 0.2363725 | 0.2657167 | 0.1882555 | 0.0768833 | 0.3258443 |
| bio12 | 0.1896500 | -0.0297564 | 0.2302545 | -0.3711966 | -0.0780296 | 0.2363725 | 1.0000000 | 0.8424374 | 0.8944831 | 0.8154404 | 0.8785945 |
| bio16 | 0.2852680 | 0.0345443 | 0.3149690 | -0.4546106 | 0.1265574 | 0.2657167 | 0.8424374 | 1.0000000 | 0.5457981 | 0.9338684 | 0.5678397 |
| bio17 | 0.0623840 | -0.1010467 | 0.1136095 | -0.2441594 | -0.2611904 | 0.1882555 | 0.8944831 | 0.5457981 | 1.0000000 | 0.5420595 | 0.9610172 |
| bio18 | 0.1344129 | -0.0937405 | 0.1521556 | -0.2976201 | 0.1482476 | 0.0768833 | 0.8154404 | 0.9338684 | 0.5420595 | 1.0000000 | 0.5256814 |
| bio19 | 0.1491180 | -0.0246704 | 0.2026153 | -0.3255694 | -0.2644505 | 0.3258443 | 0.8785945 | 0.5678397 | 0.9610172 | 0.5256814 | 1.0000000 |
And plot them with multidimensional scaling (MDS) and a heatmap
raster.cor.plot(modelEnv)
## $cor.mds.plot
##
## $cor.heatmap
Create a script that do the following and send it to ev243@cornell.edu:
a. Find records for a species you are interested in
b. Filter the records to the native range of your species
c. Pick uncorrelated and interesting variables and comment using # why you picked them
d. Create a model with current climate
e. Test your model and comment if it performs well
f. Use your ENM to predict to the future or past climate and comment your result
If you can not get it to work you can try the Wallace app (https://wallaceecomod.github.io/vignettes/wallace_vignette.html), with the following commands that will open a window of your internet browser with the app. Also find (Mann 510) or email me and ask for help if errors persist.
#install.packages('wallace')
#library(wallace)
#run_wallace()